Chart parsing according to the slot and filler principle
Abstract
A parser is an algorithm that assigns a structural description to a string according to a grammar. It follows from this definition that there are three general issues in parser design: the structure to be assigned, the type of grammar, and the recognition algorithm. Common parsers employ phrase structure descriptions, rule-based grammars, and derivation- or transition-oriented recognition. The following choices result in a new parser: The structure to be assigned to the input is a dependency tree with lexical, morpho-syntactic and functional-syntactic information associated with each node and coded by complex categories which are subject to unification. The grammar is lexicalized, i.e. the syntactic relationships are stated as part of the lexical descriptions of the elements of the language. The algorithm relies on the slot and filler principle in order to draw up complex structures. It utilizes a well-formed substring table (chart) which allows for discontinuous segments.

1. Dependency Structure

The structuring principle of constituency trees is concatenation and the part-whole relationship. The structuring principle of dependency trees is the relationship between lexemes and their complements. Note: It is not correct (or at least it is misleading) to define dependency as a relationship between words, as is often done. The possibility and necessity of complements depend on the lexical meaning of words, i.e. a word which denotes a relationship asks for the entities which it relates, a word which denotes a modification asks for an entity which it modifies, etc. While it is awkward to associate functions (deep cases, roles, grammatical relationships) with phrase structures, it is not difficult to paraphrase the functions of complements on a lexical basis. For example, the argument of the predicate "sleep" denotes the sleeper; the meaning of "persuade" includes the persuader, the persuaded person and the contents of the persuasion.
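The lexical view of complements can be made concrete with a small sketch. The Python fragment below is only an illustration under assumed names (`Lexeme`, `LEXICON`, `open_roles` and the role labels are hypothetical and do not come from the paper); it records, for each lexeme, the concrete complement roles that its meaning asks for.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Lexeme:
    """A lexeme together with the concrete roles its meaning asks for."""
    name: str
    complements: Tuple[str, ...]


# Hypothetical toy lexicon built from the two examples in the text.
LEXICON: Dict[str, Lexeme] = {
    "sleep": Lexeme("sleep", ("sleeper",)),
    "persuade": Lexeme("persuade", ("persuader", "persuaded", "content")),
}


def open_roles(lexeme_name: str, filled: Set[str]) -> List[str]:
    """Roles the lexeme still asks for, given the roles already filled."""
    entry = LEXICON[lexeme_name]
    return [role for role in entry.complements if role not in filled]
```

For instance, `open_roles("persuade", {"persuader"})` lists the two roles that are still open, whichever word or phrase eventually fills them.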
In a next step, one can abstract from the concrete function of dependents and arrive at abstract functions like subject, object, adjunct, etc. Of course, the complements covering these roles can be single words as well as large phrases; for example "John", "my father", "the president of the United States" can all fill the role of the sleeper with respect to the predicate "sleep". However, phrases need not be represented by separate nodes in dependency trees (as they are in phrase markers) because their internal structure is again a question of dependency between lexemes and their complements. In a dependency tree, phrases are represented directly by their internal structure, which results in an arc between the superordinated head and the head within the complementary phrase. Nevertheless, the real principle of dependency is a relationship between words and structures, or, formally, between single nodes and trees. Taking this into account, dependency trees are much more appealing than has been recognized so far.

In order to restrict linguistic structures according to syntactic and semantic requirements, the use of complex categories is state of the art. Complex categories are sets of parameters (attributes) and values (features). Agreement between entities can be formulated in a general way in terms of parameters; the assignment of actual feature values is achieved by the mechanism of unification. If dependency is the relationship along which the categories are unified, functional-syntactic and morpho-syntactic features can be handled completely in parallel, as opposed to the two-phase mechanism which, for example, characterizes Lexical Functional Grammar. Each element in the dependency tree carries three labels: a role (which applies to the (sub)tree of which the element is the head), a lexeme, and a set of grammatical features.

Constituency and dependency both have to be represented somehow or other in the syntactic description.
As a consequence, recent developments have led to a convergence of formalisms of both origins with respect to their contents. (A good example is the similarity between Head-Driven Phrase Structure Grammar /Pollard, Sag 1987/ and Dependency Unification Grammar /Hellwig 1986/.) If phrase structure trees are used, the difference between governor and dependent must be denoted by the categories that label the nodes, e.g. by an X-bar notation. If dependency trees are used, the concatenation relationship must be denoted by positional features which are part of the complex morpho-syntactic category.

2. Chart parsing based on a lexicalized grammar

The structure to be associated with a well-formed string can be defined in two ways: either by a set of abstract rules which describe the possible constructions of the language or by a description of the combination capabilities of the basic elements. The latter fits with the dependency approach. Given a lexical item and its morpho-syntactic properties, it is relatively easy to give a precise description of its possible complements. The main advantage of this lexicalistic approach is that augmenting or changing the description of an item normally does not interfere with the rest, while any change in a rule-based grammar might produce unforeseen side effects with regard to the whole.

The prerequisite for a lexicalized dependency grammar is trees that comprise slots. A slot is a description of the head of a tree that fits into another tree. Formally, a slot is a direct dependent of a head with a role associated with it, with a variable in the lexeme position, and with a categorization that covers all of the morpho-syntactic properties of the appertaining complement. If cross-categorization does not allow all of the possible properties of a complement to be stated within one category, a disjunction of slots is used to express the alternatives.
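As a rough illustration of the slot definition above, the following Python sketch models a node with the three labels (role, lexeme, category) and treats a node whose lexeme is a variable (`None`) as a slot. The representation of a complex category as a mapping from attributes to sets of admissible values, and the names `Node`, `unify`, `fits` and `cat`, are assumptions made for this example and are not taken from the paper. Disjunctions of slots are not modelled.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Optional

Category = Dict[str, FrozenSet[str]]  # attribute -> admissible values (assumed encoding)


def unify(a: Category, b: Category) -> Optional[Category]:
    """Intersect admissible values attribute by attribute; None signals a clash."""
    result = dict(a)
    for attr, values in b.items():
        common = result.get(attr, values) & values
        if not common:
            return None
        result[attr] = common
    return result


@dataclass
class Node:
    role: Optional[str]     # None: role still unknown (a variable)
    lexeme: Optional[str]   # None: lexeme variable, i.e. this node is a slot
    category: Category
    dependents: List["Node"] = field(default_factory=list)


def fits(slot: Node, filler: Node) -> Optional[Node]:
    """Return the filler as it would appear in the slot's position, or None."""
    if slot.lexeme is not None:
        return None  # not a slot
    merged = unify(slot.category, filler.category)
    if merged is None:
        return None
    return Node(slot.role, filler.lexeme, merged, filler.dependents)


def cat(**attrs: str) -> Category:
    """Helper: cat(number="sg|pl") gives {"number": frozenset({"sg", "pl"})}."""
    return {key: frozenset(value.split("|")) for key, value in attrs.items()}


# A hypothetical lexical tree for "sleeps" with one subject slot, and two candidates.
subject_slot = Node("subject", None, cat(pos="noun", case="nom", number="sg"))
sleep = Node(None, "sleep", cat(pos="verb", number="sg"), [subject_slot])
john = Node(None, "John", cat(pos="noun", case="nom|acc", number="sg"))
children = Node(None, "children", cat(pos="noun", case="nom|acc", number="pl"))
```

Here `fits(subject_slot, john)` yields a node that carries the role of the slot, the lexeme of the filler and the unified category, whereas `fits(subject_slot, children)` fails because the number values do not unify.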
The only mechanism needed for drawing up complex structures is the unification of slots and potential fillers.

The control of the parsing process is achieved by means of a well-formed substring table (chart). It is widely accepted that chart parsing is superior to backtracking or to parallel processing of every path. A common version of a chart can be visualized as a network of vertices representing points in the input, linked by edges representing segments. The edges are labelled with the categories that the parser has assigned to the constituents concerned. Alternatively, each edge is associated with a complete structural description, including the information which is carried by the covered edges. In this case, a chart is simply a collection of trees (implemented as lists) projected on the various segments in the input.

The innovation with regard to chart parsing that is proposed in this paper is a labelling of edges by trees that comprise slots. At the beginning, an edge for each word is entered into the chart. Each edge is labelled with a tree. The head of this tree contains the lexeme that is associated with the word according to the lexicon; it carries a morpho-syntactic category according to the morphological properties of the word in question; it normally contains a variable as a role marker, since the syntagmatic function of the corresponding segment is still unknown. A slot is subordinated to the head for each element that is to be dependent in the resulting structure, if any. Slots are added to a lexical item according to completion patterns referred to in the lexicon. (We cannot go into details here.)

Subsequently, each tree in the chart looks for a slot in a tree that is associated with another edge. If the head of the searching tree fits the description in the slot, then a new edge is drawn and labelled with the compound tree that results from inserting the first tree into the second.
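The procedure just outlined can be sketched in a few lines of Python. This is a simplified illustration, not the implementation of the paper: the toy lexicon, the names `Slot`, `Tree`, `Edge`, `fill` and `parse`, the use of a plain category string in place of full unification, the `side` attribute standing in for positional features, and the restriction that only trees without open slots may act as fillers are all assumptions of this example. Each edge records the set of word positions it covers, which is one possible way of admitting discontinuous segments, and the combination step is simply repeated until no new edge appears.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Slot:
    role: str
    cat: str    # category required of the filler's head (stands in for unification)
    side: str   # "left" or "right": filler position relative to the head word


@dataclass(frozen=True)
class Tree:
    lexeme: str
    cat: str
    head_pos: int
    slots: Tuple[Slot, ...]                 # slots that are still open
    deps: Tuple[Tuple[str, "Tree"], ...]    # (role, filler tree) pairs


@dataclass(frozen=True)
class Edge:
    cover: FrozenSet[int]   # word positions covered; need not be contiguous
    tree: Tree


# Hypothetical toy lexicon: word -> (lexeme, category, slots).
LEXICON = {
    "the": ("the", "det", ()),
    "dog": ("dog", "noun", (Slot("determiner", "det", "left"),)),
    "sleeps": ("sleep", "verb", (Slot("subject", "noun", "left"),)),
}


def fill(host: Edge, filler: Edge) -> List[Edge]:
    """Try to insert the filler tree into each open slot of the host tree."""
    if host.cover & filler.cover or filler.tree.slots:
        return []  # overlapping segments, or filler not yet complete (assumption)
    new_edges = []
    for i, slot in enumerate(host.tree.slots):
        if slot.cat != filler.tree.cat:
            continue
        if slot.side == "left" and not max(filler.cover) < host.tree.head_pos:
            continue
        if slot.side == "right" and not min(filler.cover) > host.tree.head_pos:
            continue
        rest = host.tree.slots[:i] + host.tree.slots[i + 1:]
        deps = host.tree.deps + ((slot.role, filler.tree),)
        tree = Tree(host.tree.lexeme, host.tree.cat, host.tree.head_pos, rest, deps)
        new_edges.append(Edge(host.cover | filler.cover, tree))
    return new_edges


def parse(words: List[str]) -> List[Tree]:
    """Enter one edge per word, then combine edges until nothing new is found."""
    chart = set()
    for pos, word in enumerate(words):
        lexeme, cat, slots = LEXICON[word]
        chart.add(Edge(frozenset({pos}), Tree(lexeme, cat, pos, slots, ())))
    agenda = list(chart)
    while agenda:
        edge = agenda.pop()
        for other in list(chart):
            for new in fill(edge, other) + fill(other, edge):
                if new not in chart:
                    chart.add(new)
                    agenda.append(new)
    full = frozenset(range(len(words)))
    return [e.tree for e in chart if e.cover == full]
```

With this toy lexicon, `parse(["the", "dog", "sleeps"])` returns a single tree headed by "sleep" whose subject is "dog", which in turn governs "the"; `parse(["dog", "the", "sleeps"])` returns no tree because the determiner slot requires its filler on the left.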
The categories of the new tree are the result of unifying the categories of the slot tree and the filler tree. Special features state the positional requirements, e.g. whether the segment corresponding to the filler has to precede or to follow the segment corresponding to the element dominating the slot. This process continues until no new tree is produced. Parsing was successful if at least one edge covers the whole input. The dependency tree associated with this edge is the desired structural description. The following example illustrates the me-
Similar resources
Unsupervised Person Slot Filling based on Graph Mining
Slot filling aims to extract the values (slot fillers) of specific attributes (slots types) for a given entity (query) from a large-scale corpus. Slot filling remains very challenging over the past seven years. We propose a simple yet effective unsupervised approach to extract slot fillers based on the following two observations: (1) a trigger is usually a salient node relative to the query and ...
Robust parsing of utterances in negotiative dialogue
The rapidly increasing number of spoken-dialogue systems have led to numerous robust parsers for limited domains having been constructed during the last decade. By “parsing” we here mean a mapping from the input utterance to a context-independent semantic representation. By “robust” we mean that the parser will give a reasonable result even on very noisy input. In simple spoken-dialogue applica...
Bootstrapping Knowledge Base Acceleration
The Streaming Slot Filler (SSF) task in TREC Knowledge Base Acceleration track involves detecting changes to slot values (relations) over time. To handle this task, the system needs to extract relations to identify slot-filler values and detect novel values. Being the first attempt at KBA, the biggest challenge that we faced was the scale of the data. We present the approach used by University ...
Semantic Interpretation Using KL-ONE
case frames have significantly eased the development and expansion of semantic coverage within our application by helping us to focus on issues of generality and specificity. The new frames we add have many slots established by inheritance; consistency has been easier to maintain; and the structure of the resulting syntaxonomy has helped in debugging. 5. Semantically Neutral Terms Case frames a...